Transmitting important bits and sailing high radio
waves: a decentralized cross-layer approach to
cooperative video transmission

Nicholas Mastronarde, Francesco Verde, Donatella Darsena, Anna Scaglione, and Mihaela van der Schaar 

Abstract 

We investigate the impact of cooperative relaying on uplink and downlink multi-user (MU) wireless video 
transmissions. The objective is to maximize the long-term sum of utilities across the video terminals in a decentralized 
fashion, by jointly optimizing the packet scheduling, the resource allocation, and the cooperation decisions, under the 
assumption that some nodes are willing to act as cooperative relays. A pricing-based distributed resource allocation 
framework is adopted, where the price reflects the expected future congestion in the network. Specifically, we formulate 
the wireless video transmission problem as an MU Markov decision process (MDP) that explicitly considers the 
cooperation at the physical layer and the medium access control sublayer, the video users' heterogeneous traffic 
characteristics, the dynamically varying network conditions, and the coupling among the users' transmission strategies 
across time due to the shared wireless resource. Although MDPs notoriously suffer from the curse of dimensionality, 
our study shows that, with appropriate simplifications and approximations, the complexity of the MU-MDP can be
significantly mitigated. Our simulation results demonstrate that integrating cooperative decisions into the MU-MDP 
optimization can increase the resource price in networks that only support low transmission rates and can decrease 
the price in networks that support high transmission rates. Additionally, our results show that cooperation allows users 
with feeble direct signals to achieve improvements in video quality on the order of 5 to 10 dB peak signal-to-noise
ratio (PSNR), with less than 0.8 dB quality loss by users with strong direct signals, and with a moderate increase 
in total network energy consumption that is significantly less than the energy that a distant node would require to 
achieve an equivalent PSNR without exploiting cooperative diversity. 

Index Terms 

Cooperative communications, cross-layer optimization, decode-and-forward relaying, Markov decision process 
(MDP), multi-user scheduling, resource allocation, wireless video transmission. 

I. Introduction 

Existing wireless networks provide dynamically varying resources with only limited support for the Quality of 
Service (QoS) required by delay-sensitive, bandwidth-intense, and loss-tolerant multimedia applications. This problem 
is further exacerbated in multi-user (MU) settings because they require multiple video streams, with heterogeneous
traffic characteristics, to share the scarce wireless resources. To address these challenges, a lot of research has focused 
on MU wireless communication [1], [2], [3], [4], [5] and, in particular, MU video streaming over wireless networks [6], 
[7], [8], [9], [10]. The majority of this research relies on cross-layer adaptation to match available system resources 
(e.g., bandwidth, power, or transmission time) to application requirements (e.g., delay or source rate), and vice versa. 
In MU video streaming applications [6], [7], [8], [9], [10], for example, cross-layer optimization is deployed to 
strike a balance between scheduling lucky users who experience very good fades, and serving users who have the 
highest priority video data to transmit. This tradeoff is important because rewarding a few lucky participants, as 
opportunistic multiple access policies do [2], [3], [4], does not translate to providing good quality to the application 
(APP) layer. Unfortunately, with the exception of [5], [11], the aforementioned research assumes that wireless users
are noncooperative. This leads to a basic inefficiency in the way that the network resources are assigned: indeed, 
good fades experienced by some nodes can go to waste because users with higher priority video data, but worse 
fades, get access to the shared wireless channel. 

A way to not let good fades go to waste is to enlist the nodes that experience good fades as cooperative helpers, 
using a number of techniques available for cooperative coding [12], [13], [14]. As mentioned above, this idea has 
been considered in [5], [11]. In [11], for example, a cross-layer optimization is proposed involving the physical (PHY)
layer, the medium access control (MAC) sublayer, and the APP layer, where layered video coding is integrated with 
randomized cooperation to enable efficient video multicast in a cooperative wireless network. However, because it is a 
multicast system, there is no need for an optimal multiple-access strategy, and no need to worry about heterogeneous 
traffic characteristics. In [5], a centralized network utility maximization (NUM) framework is proposed for jointly 
optimizing relay strategies and resource allocations in a cooperative orthogonal frequency-division multiple-access 
(OFDMA) network. In both [5], [11], it is assumed that each user has a static utility function of the average 
transmission rate, where the utility derived by each user in [11] is a function of the average received rate of the base 
and enhancement layer video bitstreams. 

Unlike the aforementioned solutions, we take a dynamic optimization approach to the cooperative MU video 
streaming problem. In particular, unlike [5], [11], the solution that we adopt explicitly considers packet-level video 
traffic characteristics (instead of flow-level) and dynamic network conditions (instead of average case conditions). Our 
solution is inspired by the cross-layer resource allocation and scheduling solution in [10], in which the MU wireless 
video streaming problem is modeled and solved as an MU Markov decision process (MDP) that allows the users,
via a uniform resource pricing solution, to obtain long-term optimal video quality in a distributed fashion. However, 
although we use the traffic model and dual decomposition proposed in [10], cooperation renders our PHY/MAC 
model completely different from that studied in [10], thus opening additional research issues with respect to [10], 
such as how the cooperation decision should be made, what is the impact of cooperation on the resource price, and 
what is the impact of cooperation on the total network energy consumption. Moreover, as recently shown in [15], 
augmenting the framework developed in [10] to also account for cooperation is challenging because of the complexity 
of the resulting cross-layer MU-MDP optimization. 

The contributions of this paper are fourfold. First, we formulate the cooperative wireless video transmission problem 
as an MU-MDP using a time-division multiple-access (TDMA)-like network, randomized space-time block coding
(STBC) [16], and a decode-and-forward cooperation strategy. To the best of our knowledge, we are the first to consider 
cooperation in a dynamic optimization framework. We show analytically that the decision to cooperate can be made 
opportunistically, independently of the MU-MDP. Consequently, each user can determine its optimal scheduling policy 
by only keeping track of its experienced cooperative transmission rates, rather than tracking the channel statistics 
throughout the network. Second, in light of the fact that opportunistic cooperation is optimal, we propose a low 
complexity opportunistic cooperative strategy for exploiting good fades in an MU wireless network. The key idea is 
that nodes can, in a distributed manner, self-select themselves to act as cooperative relays. The proposed self-selection 
strategy requires a number of message exchanges that is linear in the number of video sources, and selects sets of 
cooperative relays in such a way that cooperation can be guaranteed to be better than direct transmission. Third, we 
show experimentally that users with feeble direct signals to the access point (AP) are conservative in their resource 
usage when cooperation is disabled. In contrast, when cooperation is enabled, users with feeble direct signals to 
the AP use cooperative relays and utilize resources more aggressively. Consequently, the uniform resource price 
that is designed to manage resources in the network tends to increase when cooperation is enabled in a network 
that only supports low transmission rates, but tends to decrease when it is enabled in a network that supports high 
transmission rates. Fourth, we study the impact of cooperation on the total network energy consumption. We show 
that the increased transmission rate afforded by cooperation requires an increase in total network energy relative to 
the lower rate direct transmission; however, this increase is moderate compared to the amount of power required to 
transmit directly to the access point at a transmission rate equivalent to the cooperative rate. 

The remainder of the paper is organized as follows. We introduce the system and application models in
Section II. In Section III, we provide expressions for the transmission rate, packet error rate, and network energy
consumption in both direct and cooperative transmission modes. In Section IV, we present the proposed MU
cross-layer PHY/MAC/APP optimization. In Section V, we propose a distributed protocol for opportunistically recruiting
cooperative relays. Finally, we report numerical results in Section VI and conclude in Section VII.

II. System Model 

We consider a network composed of M users streaming video content over a shared wireless channel to a single 
AP (see Fig. 1). Such a scenario is typical of many uplink media applications, such as remote monitoring and 
surveillance, wireless video sensors, and mobile video cameras. The proposed optimization framework can also be 
used for downlink applications, where the relays can be recruited for streaming video to a certain user in the network 
in exactly the same way that they can be recruited to transmit to the AP in the uplink scenario. In Subsection II-A, 
we introduce the MAC and PHY layer models. Then, in Subsection II-B, we describe the deployed APP layer model. 

A. MAC and PHY layer models 

We assume that time is slotted into discrete time-intervals of length $R > 0$ seconds and each time slot is indexed
by $t \in \mathbb{N}$.^1 At the MAC sublayer, the users access the shared channel using a TDMA-like protocol. In each time
slot $t$, the AP endows the $i$th user, for $i \in \{1, 2, \ldots, M\}$, with the resource fraction $x_t^i$, where $0 \le x_t^i \le 1$, such
that the user can use the amount of channel time $R\,x_t^i$ for transmission. Let $\mathbf{x}_t = (x_t^1, x_t^2, \ldots, x_t^M)^T \in \mathbb{R}^M$ denote
the resource allocation vector at time slot $t$, which must satisfy the stage resource constraint $\|\mathbf{x}_t\|_1 = \sum_{i=1}^{M} x_t^i \le 1$,
where the inequality accounts for possible signaling overhead.

Each node's PHY layer is assumed to be a single-carrier single-input single-output system designed to handle
quadrature amplitude modulation (QAM) square constellations, with a (fixed) symbol rate of $1/T_s$ symbols per
second. The PHY layer can support a set of $N + 1$ data rates $\beta_n = b_n/T_s$ (bits/second), where $b_n = \log_2(M_n)$ is
the number of bits that are sent every symbol period, with $n \in \{0, 1, \ldots, N\}$, and $M_n$ is the number of signals in

^1 The fields of complex, real, and nonnegative integer numbers are denoted with $\mathbb{C}$, $\mathbb{R}$, and $\mathbb{N}$, respectively; matrices [vectors] are denoted
with upper [lower] case boldface letters (e.g., $\mathbf{A}$ or $\mathbf{x}$); the field of $m \times n$ complex [real] matrices is denoted as $\mathbb{C}^{m \times n}$ [$\mathbb{R}^{m \times n}$], with
$\mathbb{C}^{m}$ [$\mathbb{R}^{m}$] used as a shorthand for $\mathbb{C}^{m \times 1}$ [$\mathbb{R}^{m \times 1}$]; the superscript $T$ denotes the transpose of a vector; $|\cdot|$ denotes the magnitude of a complex
number; $\|\mathbf{x}\|_1$ is the $\ell_1$ norm of the vector $\mathbf{x} \in \mathbb{C}^n$, which for positive real-valued vectors is simply the sum of the components, whereas $\|\mathbf{x}\|_2$
is the Euclidean norm of $\mathbf{x} \in \mathbb{C}^n$; $\{\mathbf{A}\}_{ij}$ indicates the $(i + 1, j + 1)$th element of the matrix $\mathbf{A} \in \mathbb{C}^{m \times n}$, with $i \in \{0, 1, \ldots, m - 1\}$ and
$j \in \{0, 1, \ldots, n - 1\}$; a circular symmetric complex Gaussian random variable $X$ with mean $\mu$ and variance $\sigma^2$ is denoted as $X \sim \mathcal{CN}(\mu, \sigma^2)$;
$\lfloor\cdot\rfloor$ and $\lceil\cdot\rceil$ denote flooring- and ceiling-integer, respectively; $\mathbb{E}[\cdot]$ stands for ensemble averaging; and, finally, $(\cdot)^+ \triangleq \max(\cdot, 0)$.
the QAM constellation. Hence, $\beta_0 < \beta_1 < \cdots < \beta_N$ form the basic rate set $\mathcal{B}$ and $\beta_0$ is the base rate at which
the nodes exchange control messages. Let $d_n$ be the minimum distance of the $M_n$-QAM constellation; the average
transmitter energy per symbol is given by

\[
\mathcal{E}_s = \frac{(M_n - 1)\, d_n^2}{6} \quad \text{(Joules)}, \tag{II.1}
\]

which is assumed to be fixed for all the nodes and data rates, i.e., it does not depend on the indices $i$ and $n$.
Consequently, the average power per symbol expended by each transmitter is $\mathcal{P}_s = \mathcal{E}_s/T_s$ (Watts). We consider a
frequency non-selective block fading model, where $h_t^{i\ell} \in \mathbb{C}$ denotes the fading coefficient over the $i \to \ell$ link in
time slot $t$, with $i \ne \ell \in \{0, 1, \ldots, M\}$, and $i = 0$ or $\ell = 0$ corresponding to the AP. It is assumed that all the
channels are dual, i.e., $|h_t^{i\ell}| = |h_t^{\ell i}|$, and that the fading coefficients are independent and identically distributed
(i.i.d.) with respect to $t$. Moreover, we define $\mathbf{H}_t \in \mathbb{C}^{(M+1) \times (M+1)}$ as the matrix collecting the fading coefficients among
all of the nodes and the AP, i.e., $\{\mathbf{H}_t\}_{i\ell} = h_t^{i\ell}$, for $i \ne \ell \in \{0, 1, \ldots, M\}$.

At the PHY layer, there are two transmission modes to choose from: direct and cooperative. In the direct
transmission mode, as shown in Fig. 1, the $i$th source node transmits directly to the AP at the data rate $\beta_t^{i0} \in \mathcal{B}$
(bits/second) for the assigned transmission time of $R\,x_t^i$ seconds. In the cooperative transmission mode, some nodes
serve as decode-and-forward relays. Specifically, in the cooperative mode, the assigned transmission time is divided
into two phases as illustrated in Fig. 1: in Phase I, the $i$th source node directly broadcasts its own data to all the nodes
in the network at the data rate $\beta_t^{i,1} \in \mathcal{B}$ for $R\,\rho_t^i\,x_t^i$ seconds, where $0 < \rho_t^i < 1$ is the Phase I time fraction; in Phase
II, some of the nodes overhearing the source transmission, belonging to a certain subset $\mathcal{C}_t^i \subset \{1, 2, \ldots, M\} - \{i\}$,
demodulate the data received in Phase I, re-modulate the original source bits, and then cooperatively transmit towards
the AP, along with the original source $i$, at the data rate $\beta_t^{i,2} \in \mathcal{B}$ for the remaining $R\,(1 - \rho_t^i)\,x_t^i$ seconds. In the
sequel, we denote with $\beta_t^{i,\mathrm{coop}}$ (bits/second) the cooperative data rate over the two phases, i.e., the amount of bits
that are transmitted in a single phase divided by the overall length of the two phases, which depends on the data rates
$\beta_t^{i,1}$ and $\beta_t^{i,2}$ attainable in each of the two hops. The decision to transmit in the direct or cooperative transmission
mode depends on fading coefficients throughout the network in time slot $t$ and on the target packet error rate (PER).
Thus, the actual transmission rate of the $i$th source in time slot $t$ is dictated by the cooperation decision $z_t^i \in \{0, 1\}$,
where $z_t^i = 1$ if cooperation is chosen, and $z_t^i = 0$ if direct transmission is chosen. In Section III, we compute the
transmission parameters $\beta_t^{i0}$ and $\beta_t^{i,\mathrm{coop}}$ as functions of a subset of the entries in $\mathbf{H}_t$, as well as the time fraction $\rho_t^i$,
and, in Section V, we describe how to determine the set of cooperative relays $\mathcal{C}_t^i$ and the cooperation decision $z_t^i$.

B. APP layer model and packet scheduling 

The source traffic can be modeled using any Markovian traffic model (e.g. [10], [19]). However, to accurately 
capture the characteristics of the video packets, we adopt the sophisticated video traffic model proposed in [10], which 
accounts for the fact that video packets have different deadlines, distortion impacts, and source-coding dependencies 
(whereas the model in [19] does not consider these characteristics). In this section, we describe the key features of 
this model, but because the problem formulation and novelty of this paper do not depend on the deployed traffic 
model (so long as the model is Markovian), we refer the interested reader to [10] for complete details. 

For $i \in \{1, 2, \ldots, M\}$, the traffic state $\mathcal{T}_t^i = \{\mathcal{F}_t^i, \mathbf{b}_t^i\}$ represents the video data that the $i$th user can potentially
transmit in time slot $t$, and comprises the following two components: the schedulable frame set $\mathcal{F}_t^i$ and the buffer state
$\mathbf{b}_t^i$. In time slot $t$, we assume that the $i$th user can transmit packets belonging to the set of video frames $\mathcal{F}_t^i$ whose
deadlines are within the scheduling time window (STW) $[t, t + W]$. The buffer state $\mathbf{b}_t^i = (b_{t,j}^i \mid j \in \mathcal{F}_t^i)^T$ represents
the number of packets of each frame in the STW that are awaiting transmission at time $t$. The $j$th component $b_{t,j}^i$ of
$\mathbf{b}_t^i$ denotes the number of packets of frame $j \in \mathcal{F}_t^i$ remaining for transmission at time $t$. We assume that each packet
has size $P$ bits. Fig. 2 illustrates how the traffic states are defined for a simple IBPB GOP structure.^2

We now define the packet scheduling action. In each time slot $t$, the $i$th user takes scheduling action $\mathbf{y}_t^i =
(y_{t,j}^i \mid j \in \mathcal{F}_t^i)^T$, which determines the number of packets to transmit out of $\mathbf{b}_t^i$. Specifically, the $j$th component $y_{t,j}^i$
of $\mathbf{y}_t^i$ represents the number of packets of the $j$th frame within the STW that are scheduled to be transmitted in time
slot $t$. Importantly, the scheduling action $\mathbf{y}_t^i$ is constrained to be in the feasible scheduling action set $\mathcal{P}^i(\mathcal{T}_t^i, \beta_t^i)$,
which depends on the traffic state $\mathcal{T}_t^i$ and the transmission rate $\beta_t^i$ supported by the PHY layer. In particular, the
following three constraints must be met:

1) Buffer: Every component of $\mathbf{y}_t^i$ must satisfy $0 \le y_{t,j}^i \le b_{t,j}^i$.

2) Packet: The total number of transmitted packets must satisfy $\|\mathbf{y}_t^i\|_1 = \sum_{j \in \mathcal{F}_t^i} y_{t,j}^i \le R\,\beta_t^i / P$, where $\beta_t^i = \beta_t^{i0}$ in the
direct transmission mode, i.e., when $z_t^i = 0$, and $\beta_t^i = \beta_t^{i,\mathrm{coop}}$ in the cooperative transmission mode, i.e., when
^2 In a typical hybrid video coder like H.264/AVC or MPEG-2, I, P, and B indicate the type of motion prediction used to exploit temporal
correlations between video frames. I-frames are compressed independently of the other frames, P-frames are predicted from previous frames, 
and B-frames are predicted from previous and future frames. 
$z_t^i = 1$. Note that $\beta_t^i$ depends on a subset of the elements in $\mathbf{H}_t$, as described later in Section III.^3

3) Dependency: If there exists a frame $k$ that has not been transmitted, and frame $j$ depends on frame $k$ (denoted
by $k \prec j$), then $(b_{t,k}^i - y_{t,k}^i)\, y_{t,j}^i = 0$. In other words, all packets associated with $k$ must be transmitted before
transmitting any packets associated with $j$.

The sequence of traffic states $\{\mathcal{T}_t^i : t \in \mathbb{N}\}$ can be modeled as a controllable Markov chain with transition
probability function $p(\mathcal{T}_{t+1}^i \mid \mathcal{T}_t^i, \mathbf{y}_t^i)$.
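
As an illustration of the three constraints above, the following Python sketch tests whether a candidate scheduling action is feasible. It is a minimal sketch under stated assumptions: the function name `is_feasible`, the dictionary layout of the buffer, and the `depends_on` map are hypothetical and introduced only for this example, and the packet constraint is written as "total scheduled bits do not exceed the slot length times the supported rate", which follows from requiring the resource fraction in (IV.5) to be at most one.

```python
def is_feasible(y, b, depends_on, beta, R, P):
    """Check the buffer, packet, and dependency constraints for one user.

    y, b       : dicts mapping frame id -> scheduled / buffered packet counts
    depends_on : dict mapping frame j -> list of frames k that j depends on
                 (hypothetical encoding of the dependency relation)
    beta       : supported transmission rate (bits/second)
    R          : time-slot length (seconds)
    P          : packet size (bits)
    """
    # 1) Buffer: 0 <= y_j <= b_j for every frame in the scheduling window.
    if any(y[j] < 0 or y[j] > b[j] for j in b):
        return False
    # 2) Packet: the scheduled bits must fit in one slot at rate beta.
    if sum(y.values()) * P > R * beta:
        return False
    # 3) Dependency: frame j may send packets only if every frame k it
    #    depends on is emptied in this slot, i.e. (b_k - y_k) * y_j == 0.
    for j, parents in depends_on.items():
        for k in parents:
            if (b[k] - y[k]) * y[j] != 0:
                return False
    return True


# Toy example: a P-frame that depends on an I-frame.
buffer_state = {"I": 2, "P": 3}
dependencies = {"P": ["I"]}
ok = is_feasible({"I": 2, "P": 1}, buffer_state, dependencies,
                 beta=1e6, R=0.01, P=1000)
```

In the toy example the I-frame is fully scheduled, so sending one packet of the dependent P-frame is allowed; scheduling only one of the two I-frame packets would violate the dependency constraint.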

III. Cooperative PHY layer transmission 

In this section, with reference to the uplink scenario, we describe how the direct transmission rate $\beta_t^{i0}$ and the
cooperative transmission rate $\beta_t^{i,\mathrm{coop}}$ depend on a subset of the elements in the channel state matrix $\mathbf{H}_t$.

Let us first consider the direct $i \to \ell$ link with instantaneous channel gain $h_t^{i\ell}$ and data rate $\beta_t^{i\ell} \in \mathcal{B}$ (bits/second)
corrupted by additive white Gaussian noise. The bit error probability (BEP) $P_t^{i\ell}(h_t^{i\ell}, \beta_t^{i\ell})$ at the output of the maximum
likelihood (ML) detector of node $\ell$, under the assumption that a Gray code is used to map the information bits into
QAM symbols and the signal-to-noise ratio (SNR) is sufficiently high, can be upper bounded as (see [20])



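\[
P_t^{i\ell}(h_t^{i\ell}, \beta_t^{i\ell}) \le c\, \exp\!\left( -\frac{3\,\gamma\,|h_t^{i\ell}|^2}{2\,\bigl(2^{\beta_t^{i\ell} T_s} - 1\bigr)} \right), \tag{III.1}
\]

where $c$ is the constant prefactor of the bound in [20], $\gamma \triangleq \mathcal{E}_s/N_0$ is the average SNR per symbol expended by the transmitter, and $N_0$ is the noise power spectral
density. Each direct transmission is subject to a PER threshold at the MAC sublayer, which leads to a BEP constraint
$P_t^{i\ell}(h_t^{i\ell}, \beta_t^{i\ell}) \le \mathrm{BEP}$ at the PHY layer. Consequently, the achievable data rate $\beta_t^{i\ell}$ under the BEP constraint is

\[
\beta_t^{i\ell} = \frac{1}{T_s} \log_2\!\bigl(1 + \Gamma\, |h_t^{i\ell}|^2\bigr), \quad \text{where } \Gamma \triangleq \frac{3\,\gamma}{2\,\bigl|\log(\mathrm{BEP}/c)\bigr|}. \tag{III.2}
\]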

The data rate $\beta_t^{i0}$ over the link between the source and the AP is obtained using (III.2) by setting $\ell = 0$. In this case,
the number of symbols required to transmit a packet of $P$ bits is equal to $K_t^{i0} = \lceil P/(\beta_t^{i0}\, T_s) \rceil$. Thus, neglecting
receive and processing energy consumption, the energy required for a direct transmission of one packet is equal to

\[
\mathcal{E}_t^{i0} = K_t^{i0}\, \mathcal{E}_s = \frac{P\, \mathcal{E}_s}{\beta_t^{i0}\, T_s} = \frac{P\, \mathcal{P}_s}{\beta_t^{i0}} \quad \text{(Joules)}. \tag{III.3}
\]

It is worth noting that the energy expended in direct mode is inversely proportional to the achievable data rate $\beta_t^{i0}$.
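
The direct-mode quantities in (III.2) and (III.3) can be evaluated numerically along the following lines. This is a hedged sketch rather than the authors' implementation: the function names are hypothetical, the SNR-gap term Γ is passed in as a precomputed input (`gamma_cap`) instead of being derived from the BEP target, and snapping the resulting rate to the discrete rate set is omitted.

```python
import math


def direct_rate(gamma_cap, h_abs, Ts):
    """Achievable rate (bits/second): (1/Ts) * log2(1 + Gamma * |h|^2).

    gamma_cap : the term Gamma of (III.2), assumed to be given
    h_abs     : magnitude of the fading coefficient of the link
    Ts        : symbol period (seconds)
    """
    return math.log2(1.0 + gamma_cap * h_abs ** 2) / Ts


def direct_energy_per_packet(P, beta, Ts, Es):
    """Energy (Joules) to send one P-bit packet in direct mode.

    K = ceil(P / (beta * Ts)) symbols are needed, each costing Es Joules.
    """
    K = math.ceil(P / (beta * Ts))
    return K * Es


# Illustrative numbers only (not taken from the paper's simulations).
beta = direct_rate(gamma_cap=3.0, h_abs=1.0, Ts=0.25)
energy = direct_energy_per_packet(P=1000, beta=beta, Ts=0.25, Es=2.0)
```

Because the symbol count is rounded up, the energy is a staircase function of the rate; the closed form in (III.3) is its smooth approximation.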



^3 We do not include $x_t^i$ in the packet constraint $\|\mathbf{y}_t^i\|_1 = \sum_{j \in \mathcal{F}_t^i} y_{t,j}^i \le R\,\beta_t^i/P$ because $x_t^i$ is not known at the time the scheduling decision
$\mathbf{y}_t^i$ is determined. Once the scheduling decision is determined, the resource allocation $x_t^i$ is determined as $x_t^i = \frac{P}{R\,\beta_t^i}\|\mathbf{y}_t^i\|_1$ (see (IV.5)).
Importantly, the stage resource constraint ensures that the scheduling decisions $\mathbf{y}_t^i$, $\forall i \in \{1, \ldots, M\}$, are selected such that $\sum_{i=1}^{M} x_t^i \le 1$.

At this point, let us consider the cooperative mode. Because of possible error propagation, the end-to-end BEP for
a two-hop cooperative transmission is cumbersome to calculate exactly with decode-and-forward relays; therefore,
the relationship that ties $\beta_t^{i,1}$, $\beta_t^{i,2}$, and the relevant channel state information, and that guarantees a certain reliability
of the overall link, is not as simple as (III.2). To significantly simplify the computation of $\beta_t^{i,1}$ and $\beta_t^{i,2}$, we use
two different BEP thresholds $\mathrm{BEP}_1$ and $\mathrm{BEP}_2$ for the first and second hops, respectively. The threshold $\mathrm{BEP}_1$ is
typically a large percentage of the total error rate budget, say $\mathrm{BEP}_1 = 0.9\,\mathrm{BEP}$, and $\mathrm{BEP}_2 = \mathrm{BEP} - \mathrm{BEP}_1$,
since the first link is the bottleneck in decode-and-forward relaying. Indeed, the performance at each relay is that of
a single-input single-output system transmitting over a fading channel. On the other hand, the transmission over the
second link (from the recruited relays to the destination) can be regarded as a distributed multiple-input single-output
system operating over a fading channel; consequently, the performance at the destination, which can take advantage
of cooperative diversity, is significantly better than that of each source-to-relay link, even when a small number
of relays are recruited. Moreover, due to this fact and since the exponential function in (III.1) decays fast as a
function of its argument, we reasonably assume that the end-to-end BEP at the output of the ML detector of the AP
is dominated by the BEP over the worst source-to-relay channel, i.e., the link for which $|h_t^{i\ell}|$ is the smallest one.
Under this assumption, accounting for (III.2), we can estimate $\beta_t^{i,1}$ in Phase I as

\[
\beta_t^{i,1} = \frac{1}{T_s} \log_2\!\Bigl(1 + \Gamma_1 \min_{\ell \in \mathcal{C}_t^i} |h_t^{i\ell}|^2\Bigr), \tag{III.4}
\]

where $\Gamma_1$ is obtained from $\Gamma$ by replacing BEP with $\mathrm{BEP}_1$. In this phase, which lasts $R\,\rho_t^i\,x_t^i$ seconds, the number
of symbols needed to transmit a packet of $P$ bits is equal to $K_t^{i,1} = \lceil P/(\beta_t^{i,1}\, T_s) \rceil$ and, thus, it must result that

\[
K_t^{i,1}\, T_s = \frac{P}{\beta_t^{i,1}} = R\,\rho_t^i\,x_t^i \;\Longrightarrow\; P = R\,\beta_t^{i,1}\,\rho_t^i\,x_t^i. \tag{III.5}
\]

Supposing that a subset $\mathcal{C}_t^i$ of the available nodes are recruited to serve as relays in Phase II, these nodes, along
with the $i$th user, cooperatively forward the source message by using a randomized STBC rule [16]. More specifically,
assuming error-free demodulation at the decode-and-forward relays, if $\mathbf{a}_t^i \in \mathbb{C}^{K_t^{i,2}}$ gathers the block of i.i.d. QAM
source symbols to be transmitted in Phase II of time slot $t$, then at the $\ell$th node, for each $\ell \in \{i\} \cup \mathcal{C}_t^i$, the vector
$\mathbf{a}_t^i$ is mapped onto an orthogonal space-time code matrix $\mathcal{G}(\mathbf{a}_t^i) \in \mathbb{C}^{Q \times L}$ [21], where $Q$ is the block length and
$L$ denotes the number of antennas in the underlying space-time code. During Phase II, the $\ell$th node transmits a
linear weighted combination of the columns of $\mathcal{G}(\mathbf{a}_t^i)$, with the weights of the $L$ columns of $\mathcal{G}(\mathbf{a}_t^i)$ contained in the
vector $\mathbf{r}_\ell \in \mathbb{C}^{L}$. We denote with $\mathbf{R} = (\mathbf{r}_\ell \mid \ell \in \mathcal{C}_t^i) \in \mathbb{C}^{L \times N_t^i}$ the weight matrix of all the cooperating nodes, where
$N_t^i < M$ is the cardinality of $\mathcal{C}_t^i$.^4 Under the randomized STBC rule, the AP observes the space-time coded signal
$\mathcal{G}(\mathbf{a}_t^i)$ with equivalent channel vector $\tilde{\mathbf{h}}_t^{i,2} = h_t^{i0}\, \mathbf{r}_i + \mathbf{R}\, \mathbf{h}_t^{i,2}$, where $\mathbf{h}_t^{i,2} = (h_t^{\ell 0} \mid \ell \in \mathcal{C}_t^i)^T \in \mathbb{C}^{N_t^i}$ collects all the
channel coefficients between the relay nodes and the AP (see Fig. 1). Note that the AP only needs to estimate $\tilde{\mathbf{h}}_t^{i,2}$
for coherent ML decoding and that the randomized coding is decentralized since the $\ell$th relay chooses $\mathbf{r}_\ell$ locally. By
capitalizing on the orthogonality of the underlying STBC matrix $\mathcal{G}(\mathbf{a}_t^i)$, the BEP $P_t^{i,2}(\tilde{\mathbf{h}}_t^{i,2}, \beta_t^{i,2})$ over the second hop
at the output of the ML detector of the AP using data rate $\beta_t^{i,2}$ (bits/second) can be upper bounded as in (III.1) by
replacing $|h_t^{i\ell}|^2$ and $\beta_t^{i\ell}$ with $\|\tilde{\mathbf{h}}_t^{i,2}\|_2^2$ and $\beta_t^{i,2}$, respectively. By imposing the BEP constraint $P_t^{i,2}(\tilde{\mathbf{h}}_t^{i,2}, \beta_t^{i,2}) \le \mathrm{BEP}_2$,
the data rate attainable on the second hop of the cooperating link is given by

\[
\beta_t^{i,2} = \frac{1}{T_s} \log_2\!\Bigl[1 + \Gamma_2 \bigl(|h_t^{i0}|^2 + \|\mathbf{R}\, \mathbf{h}_t^{i,2}\|_2^2\bigr)\Bigr], \tag{III.6}
\]



where $\Gamma_2$ is obtained from $\Gamma$ in (III.2) by replacing BEP with $\mathrm{BEP}_2$. In this phase, which lasts $R\,(1 - \rho_t^i)\,x_t^i$
seconds, the number of symbols needed to transmit a packet of $P$ bits is equal to $K_t^{i,2} = \lceil P/(\beta_t^{i,2}\, T_s) \rceil$ and, thus, it
must result that

\[
Q\, T_s = \frac{P}{R_c\, \beta_t^{i,2}} = R\,(1 - \rho_t^i)\,x_t^i \;\Longrightarrow\; P = R\, R_c\, \beta_t^{i,2}\,(1 - \rho_t^i)\,x_t^i, \tag{III.7}
\]

where $R_c = K_t^{i,2}/Q \le 1$ is the rate of the orthogonal STBC rule. From (III.5) and (III.7), the transmission time for
the two phase communication mode is

\[
R\, x_t^i = \frac{P}{\beta_t^{i,1}} + \frac{P}{R_c\, \beta_t^{i,2}} = P \left( \frac{1}{\beta_t^{i,1}} + \frac{1}{R_c\, \beta_t^{i,2}} \right) = \frac{P}{\beta_t^{i,\mathrm{coop}}}, \tag{III.8}
\]

which also unveils the functional dependence of $\beta_t^{i,\mathrm{coop}}$ on $\beta_t^{i,1}$ and $\beta_t^{i,2}$. Moreover, from (III.5) and (III.7),
it is required that

\[
R\, \beta_t^{i,1}\, \rho_t^i\, x_t^i = R\, R_c\, \beta_t^{i,2}\,(1 - \rho_t^i)\, x_t^i \;\Longrightarrow\; \rho_t^i = \frac{1}{1 + \beta_t^{i,1} / (\beta_t^{i,2}\, R_c)}, \tag{III.9}
\]

which shows that, given the STBC rule, the time fraction $\rho_t^i$ is determined by the data rates in Phase I and II.
The cooperative mode is activated only if the cooperative transmission is more data-rate efficient than the direct
communication, i.e., only if $\beta_t^{i,\mathrm{coop}} > \beta_t^{i0}$, which from (III.8) leads to the following condition

\[
\frac{1}{\beta_t^{i,1}} + \frac{1}{R_c\, \beta_t^{i,2}} < \frac{1}{\beta_t^{i0}}. \tag{III.10}
\]

^4 One specific code of the STBC matrix is always assigned to the source itself, which transmits over the cooperative link every time
cooperation is activated. This can be accounted for by simply setting $\mathbf{r}_i = (1, 0, \ldots, 0)^T$ and replacing the first row of $\mathbf{R}$ with $(0, \ldots, 0)$,
whereas the remaining entries of $\mathbf{R}$ are identically and independently generated random variables with zero mean and variance $1/L$.

If condition (III.10) is fulfilled, then the opportunistically optimal cooperation decision is $z_t^i = 1$; otherwise, the $i$th
source transmits to the AP in direct mode and $z_t^i = 0$.
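
The relations between the two hop rates, the Phase I time fraction, and the cooperation decision can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the hop rates are assumed to be already known, and the end-to-end rate is taken to satisfy 1/β_coop = 1/β_1 + 1/(R_c β_2), which is the two-phase timing relation of this section.

```python
def cooperative_rate(beta1, beta2, Rc):
    """End-to-end rate over both phases: 1/beta_coop = 1/beta1 + 1/(Rc*beta2)."""
    return 1.0 / (1.0 / beta1 + 1.0 / (Rc * beta2))


def phase1_fraction(beta1, beta2, Rc):
    """Fraction rho of the assigned time spent in Phase I.

    Both phases must carry the same P bits, so
    beta1 * rho = Rc * beta2 * (1 - rho).
    """
    return 1.0 / (1.0 + beta1 / (beta2 * Rc))


def cooperation_decision(beta_direct, beta1, beta2, Rc):
    """Return z = 1 if cooperating beats the direct rate, else z = 0."""
    return 1 if cooperative_rate(beta1, beta2, Rc) > beta_direct else 0


# Illustrative rates in arbitrary units (not from the paper's simulations).
beta_coop = cooperative_rate(beta1=4.0, beta2=4.0, Rc=1.0)
rho = phase1_fraction(beta1=4.0, beta2=4.0, Rc=1.0)
z_weak = cooperation_decision(beta_direct=1.0, beta1=4.0, beta2=4.0, Rc=1.0)
z_strong = cooperation_decision(beta_direct=3.0, beta1=4.0, beta2=4.0, Rc=1.0)
```

With equal hop rates and a rate-one code, the two phases split the time evenly and the end-to-end rate is half of each hop rate, so only a user whose direct rate is below that value benefits from cooperating.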

It is interesting to evaluate the energy consumption in the case of a cooperative transmission. Neglecting receive
and processing energy consumption, the energy expended by the source $i$ for transmission of one packet is equal to

\[
\mathcal{E}_t^{i,\mathrm{source}} = \left( K_t^{i,1} + \frac{K_t^{i,2}}{R_c} \right) \mathcal{E}_s = \frac{P\, \mathcal{P}_s}{\beta_t^{i,\mathrm{coop}}} \quad \text{(Joules)}, \tag{III.11}
\]

whereas the energy expended by each recruited relay node for transmitting one packet of the $i$th source is given by

\[
\mathcal{E}_t^{i,\mathrm{relay}} = \frac{K_t^{i,2}}{R_c}\, \mathcal{E}_s = \frac{P\, \mathcal{P}_s}{R_c\, \beta_t^{i,2}} \quad \text{(Joules)}. \tag{III.12}
\]

It is noteworthy from (III.3) and (III.11) that, since cooperation is activated only when $\beta_t^{i,\mathrm{coop}} > \beta_t^{i0}$, the energy
expended by the source node $i$ for a cooperative transmission is smaller than that required by the same node for a
direct transmission. On the other hand, the energy (III.12) expended by the relays is inversely proportional to the
achievable data rate in Phase II. Therefore, provided that $R_c\, \beta_t^{i,2} \gg \beta_t^{i0}$, over a sufficiently long period, the energy
expenditure in relaying another node's data can be partially compensated for when the recruited relay acts as a source
in the network. The total energy expended in the network to transmit $\|\mathbf{y}_t^i\|_1$ packets for user $i$ can be expressed as

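\[
\mathcal{E}_t^i(\mathbf{y}_t^i, z_t^i, \mathcal{C}_t^i) =
\begin{cases}
\|\mathbf{y}_t^i\|_1\, \mathcal{E}_t^{i0}, & \text{if } z_t^i = 0; \\
\|\mathbf{y}_t^i\|_1 \bigl( \mathcal{E}_t^{i,\mathrm{source}} + N_t^i\, \mathcal{E}_t^{i,\mathrm{relay}} \bigr), & \text{if } z_t^i = 1.
\end{cases} \tag{III.13}
\]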

The energy consumption in the direct and cooperative modes is numerically compared in Section VI. 
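
A rough numerical version of the total-energy expression (III.13) is sketched below. It rests on assumptions that should be read as such: the function name and argument list are hypothetical, the ceiling operations in the symbol counts are ignored, the source is assumed to transmit at power P_s during both phases, and each recruited relay is assumed to transmit at power P_s for the whole of Phase II, whose duration per packet is P/(R_c β_2).

```python
def network_energy(num_packets, z, P, Ps, beta_direct, beta1, beta2, Rc,
                   num_relays):
    """Total energy (Joules) spent to deliver num_packets packets of one user.

    z           : cooperation decision (0 = direct, 1 = cooperative)
    P, Ps       : packet size (bits) and transmit power (Watts)
    beta_direct : direct rate to the AP (bits/second)
    beta1/beta2 : Phase I / Phase II rates (bits/second)
    Rc          : rate of the space-time block code (<= 1)
    num_relays  : number of recruited relays
    """
    if z == 0:
        # Direct mode: one transmitter, P/beta_direct seconds per packet.
        return num_packets * P * Ps / beta_direct
    # Cooperative mode (assumed closed forms, ceilings ignored):
    # the source is active in both phases, each relay only in Phase II.
    e_source = P * Ps * (1.0 / beta1 + 1.0 / (Rc * beta2))
    e_relay = P * Ps / (Rc * beta2)
    return num_packets * (e_source + num_relays * e_relay)


# Illustrative comparison in arbitrary units.
e_direct = network_energy(3, 0, P=1000, Ps=2.0, beta_direct=1.0,
                          beta1=4.0, beta2=4.0, Rc=1.0, num_relays=1)
e_coop = network_energy(3, 1, P=1000, Ps=2.0, beta_direct=1.0,
                        beta1=4.0, beta2=4.0, Rc=1.0, num_relays=1)
```

In this toy setting the cooperative mode spends less total energy than the slow direct link even though two nodes transmit in Phase II, because the overall airtime per packet is much shorter.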

IV. Cooperative Multi-User Video Transmission 

Recall that $\mathcal{T}_t^i$ denotes the $i$th user's traffic state and $\mathbf{H}_t$ collects the channel coefficients among all the nodes and
the AP. Hence, the global state can be defined as $s_t = (\mathcal{T}_t^1, \mathcal{T}_t^2, \ldots, \mathcal{T}_t^M, \mathbf{H}_t) \in \mathcal{S}$, where $\mathcal{S}$ is a discrete set of all
possible states.^5 Since: (i) the $i$th user's traffic state evolves as a Markov process controlled by its scheduling action
$\mathbf{y}_t^i$; (ii) the $i$th user's traffic state transition is conditionally independent of the other users' traffic state transitions
given $\mathbf{y}_t^i$; and (iii) the state of each $i \to \ell$ link $h_t^{i\ell}$ is assumed to be i.i.d. with respect to time; the sequence of global
states $\{s_t : t \in \mathbb{N}\}$ can be modeled as a controlled Markov process with transition probability function

\[
p(s_{t+1} \mid s_t, \mathbf{y}_t) = p(\mathbf{H}_{t+1}) \prod_{i=1}^{M} p(\mathcal{T}_{t+1}^i \mid \mathcal{T}_t^i, \mathbf{y}_t^i), \tag{IV.1}
\]

^5 To have a discrete set of network states, the individual link states in $\mathbf{H}_t$ are quantized into a finite number of bins (see [24] for details).

where $\mathbf{y}_t = \bigl((\mathbf{y}_t^1)^T, (\mathbf{y}_t^2)^T, \ldots, (\mathbf{y}_t^M)^T\bigr)^T$ collects the scheduling actions of all the video users.
Under the scheduling action $\mathbf{y}_t^i$, the $i$th user obtains the immediate utility

\[
u^i(\mathcal{T}_t^i, \mathbf{y}_t^i) = \sum_{j \in \mathcal{F}_t^i} q_j^i\, y_{t,j}^i, \tag{IV.2}
\]

which is the total video quality improvement experienced by the $i$th user by taking scheduling action $\mathbf{y}_t^i$ in traffic
state $\mathcal{T}_t^i$ under the assumption that quality is incrementally additive [17].

The objective of the MU optimization is the maximization of the expected discounted sum of utilities with respect
to the joint scheduling action $\mathbf{y}_t$ and the cooperation decision vector $\mathbf{z}_t = (z_t^1, z_t^2, \ldots, z_t^M)^T$ taken in each state $s_t$.
Due to the stationary Markovian transition probability function, the optimization can be formulated as an MDP that
satisfies the following dynamic programming equation^6

\[
U^*(s) = \max_{\mathbf{y},\, \mathbf{z}} \left\{ \sum_{i=1}^{M} u^i(\mathcal{T}^i, \mathbf{y}^i) + \alpha \sum_{s' \in \mathcal{S}} p(\mathbf{H}') \prod_{i=1}^{M} p(\mathcal{T}^{i\prime} \mid \mathcal{T}^i, \mathbf{y}^i)\, U^*(s') \right\}, \tag{IV.3}
\]

subject to

\[
\mathbf{y}^i \in \mathcal{P}^i(\mathcal{T}^i, \mathbf{H}) \quad \text{and} \quad \sum_{i=1}^{M} x^i \le 1, \tag{IV.4}
\]

where $x^i$ is the time-fraction allocated to the $i$th user given its scheduling action $\mathbf{y}^i$ and transmission rate $\beta^i$, i.e.,

\[
x^i = \frac{P}{R\, \beta^i}\, \|\mathbf{y}^i\|_1, \tag{IV.5}
\]

the parameter $\alpha \in [0, 1)$ is the "discount factor", which accounts for the relative importance of the present and future
utility, and $\mathcal{P}^i(\mathcal{T}^i, \mathbf{H})$ is the set of feasible scheduling actions given the traffic state $\mathcal{T}^i$ and channel state matrix $\mathbf{H}$.
From Theorem 6.2.5 in [26], we know that there exists a stationary optimal policy that is the global optimal solution
to (IV.3).
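
The coupling between users enters only through the time fractions, which the following Python sketch makes explicit. It is a minimal illustration under stated assumptions: the function names are hypothetical, each user is summarized by its total number of scheduled packets and its supported rate, and the time fraction is computed as packet size times scheduled packets divided by slot length times rate, as in (IV.5).

```python
def resource_fractions(packets_per_user, rates, P, R):
    """Time fraction x^i = P * ||y^i||_1 / (R * beta^i) for every user.

    packets_per_user : total packets scheduled by each user in this slot
    rates            : supported rate of each user (bits/second)
    P, R             : packet size (bits) and slot length (seconds)
    """
    return [P * n / (R * beta) for n, beta in zip(packets_per_user, rates)]


def satisfies_stage_constraint(fractions):
    """True if the users' time fractions fit in one slot (sum <= 1)."""
    return sum(fractions) <= 1.0


# Illustrative numbers in arbitrary units.
x = resource_fractions([2, 1], [8.0, 4.0], P=1.0, R=1.0)
fits = satisfies_stage_constraint(x)
```

A user with a higher supported rate consumes a smaller share of the slot for the same number of packets, which is why raising a weak user's rate through cooperation changes how much of the shared resource it can afford to request.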

Given the distributions $p(\mathbf{H})$ and $p(\mathcal{T}^{i\prime} \mid \mathcal{T}^i, \mathbf{y}^i)$ for all $i$, the above MU-MDP can be solved by the AP using value
iteration or policy iteration [18]. However, there are two challenges associated with solving the above MU-MDP.
First, the complexity of solving an MDP is proportional to the cardinality of its state-space $\mathcal{S}$, which, in the above
MU-MDP, scales exponentially with the number of users, i.e., $M$, and with the number of links in $\mathbf{H}$, i.e., $M^2$.
Hence, even for moderate sized networks, it is impractical to compute, or even to encode, $U^*(s)$. In Subsection IV-A,
we show that the exponential dependence on the number of links in $\mathbf{H}$ can be eliminated. Second, in the uplink
^6 In this section, since we model the problem as a stationary MDP, we omit the time index when it does not create confusion. In place of
the time index, we use the notation $(\cdot)'$ to denote a state variable in the next time step (e.g., $\mathcal{T}^{i\prime}$, $\mathbf{H}'$, $s'$).
scenario, the traffic state information is local to the users, so neither the AP nor the users have enough information
to solve the above MU-MDP. In Subsection IV-B, we summarize the findings in [10] that show that the considered
optimization can be approximated to make it amenable to a distributed solution. Additionally, this distributed solution
eliminates the exponential dependence on the number of users. Note that the simplification in Subsection IV-A is
very important, because only after obtaining this result does it become possible to use the solution in [10].

A. Reformulation with simplified network state 

The only reason to include the detailed network state information H and the cooperation decision z in the MU-MDP
is to make foresighted cooperation decisions, which take into account the impact of the immediate cooperation
decision on the expected future utility of the users. However, if we can show that the optimal opportunistic (i.e., 
myopic) cooperation decision is also long-term optimal, then the detailed network state information does not need 
to be included in the MU-MDP. The following theorem shows that the optimal opportunistic cooperation decision, 
which maximizes the immediate transmission rate, is also long-term optimal. 

Theorem 1 (Opportunistic cooperation is optimal): If utilizing cooperation incurs zero cost to the source and 
relays, then the optimal opportunistic cooperation decision, which maximizes the immediate throughput, is also 
long-term optimal. 

Proof: See Appendix I. ■ 
To intuitively understand why maximizing the immediate transmission rate at the PHY layer is long-term optimal, 
consider what happens when a user chooses not to maximize its immediate transmission rate (i.e., does not utilize 
the optimal opportunistic cooperation decision). Two things can happen: either fewer packets are transmitted overall
because of packet expirations; or, the same number of packets are transmitted overall, but their transmission incurs 
additional resource costs because transmitting the same number of packets at a lower rate requires more resources 
[see (IV.5)]. In either case, the long-term utility is suboptimal. A consequence of Theorem 1 is that the cooperation 
decision vector z does not need to be included in the MU-MDP. Instead, it can be determined opportunistically by 
selecting z to maximize the immediate transmission rate. Most importantly, this means that the MU-MDP does not 
need to include the high-dimensional network state. 
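
As a rough illustration of this consequence, the following Python sketch selects the cooperation decision that maximizes the immediate rate and shows that the larger rate admits at least as many packets per slot. The function names, the example rates, and the packet bound (packets that fit in one slot of duration `slot_s`) are assumptions made for illustration; they are not the paper's definitions.

```python
import math

def opportunistic_decision(rate_direct, rate_coop):
    """Return (z, rate): z = 1 if cooperation yields the higher immediate rate."""
    if rate_coop > rate_direct:
        return 1, rate_coop
    return 0, rate_direct

def max_packets(rate_bps, packet_bits=8000, slot_s=1.0):
    """Assumed packet bound: whole packets that fit in one slot at this rate."""
    return math.floor(rate_bps * slot_s / packet_bits)

# Hypothetical rates (bits/second) for one user in one time slot.
z, rate = opportunistic_decision(rate_direct=400_000.0, rate_coop=1_000_000.0)
print(z, max_packets(rate), max_packets(400_000.0))
```

Because the feasible scheduling set only grows with the rate, any schedule that is feasible at the lower rate remains feasible at the opportunistic rate.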

We now make two remarks regarding Theorem 1 so that its consequences are not misinterpreted. First, in the introduction, we noted that maximizing throughput is a suboptimal multiple access strategy for wireless video. This does not contradict Theorem 1, because the theorem only states that the cooperation decision should be made opportunistically to maximize the immediate transmission rate. Indeed, myopic (opportunistic) resource allocation and scheduling are suboptimal because they do not take into account the dynamic video data attributes (i.e., deadlines, priorities, and dependencies). Second, although the users' MDPs do not need to include the high-dimensional network state, the optimal resource allocation and scheduling strategies still depend on it; however, instead of tracking $\mathbf{H}_t$, it is sufficient to track the users' optimal opportunistic transmission rates provided by the PHY layer, i.e., $\beta_t^i$ for all $i$. Under the assumption that the channel coefficients are i.i.d. random variables with respect to $t$, $\beta_t^i$ can also be modeled as an i.i.d. random variable with respect to $t$. We let $p(\beta^i)$ denote the probability mass function (pmf) from which $\beta_t^i$ is drawn. We note that $p(\beta^i)$ depends on $p(\mathbf{H})$ and the deployed PHY layer cooperation algorithm.

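Based on the second remark, we can simplify the maximization problem in (IV.3). Let us define the $i$th user's state as $s^i = (\mathcal{T}^i, \beta^i) \in \mathcal{S}^i$ and redefine the global state as $s = (s^1, \ldots, s^M)$. In Section V, we describe how $\beta^i$ is determined, but for now we take for granted that it is known. Because the optimization does not need to include the cooperation decision, the maximization of the expected sum of discounted utilities in (IV.3) can be simplified by only maximizing with respect to the scheduling action $y$ in each state $s$, that is,

$$U^*(s) = \max_{y} \left\{ \sum_{i=1}^{M} u^i(\mathcal{T}^i, y^i) + \alpha \sum_{s' \in \mathcal{S}} \prod_{i=1}^{M} p(s^{i\prime} \mid s^i, y^i)\, U^*(s') \right\} \qquad \text{(IV.6)}$$

subject to

$$y^i \in \mathcal{P}^i(\mathcal{T}^i, \beta^i) \quad \text{and} \quad \sum_{i=1}^{M} x^i \leq 1, \qquad \text{(IV.7)}$$

where $p(s^{i\prime} \mid s^i, y^i) = p(\beta^{i\prime})\, p(\mathcal{T}^{i\prime} \mid \mathcal{T}^i, y^i)$.

B. Distributed solution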

Similar to [10], (IV.6) can be reformulated as an unconstrained MDP using Lagrangian relaxation. The key idea is to introduce a Lagrange multiplier $\lambda_s$ associated with the stage resource constraint $\sum_{i=1}^{M} x^i \leq 1$ in each global state $s$, because every global state has a different resource-quality tradeoff. The resulting dual solution has zero duality gap compared to the primal problem [i.e., (IV.6)], but it still depends on the global state, so it is not amenable to a distributed solution. However, by imposing a uniform resource price $\lambda_s = \lambda$, $\forall s \in \mathcal{S}$, which is independent of the multi-user state, the resulting MU-MDP can be decomposed into $M$ MDPs, one for each user [10].* These local MDPs satisfy the following dynamic programming equation

$$U^{i,*}(s^i, \lambda) = \max_{y^i} \left\{ u^i(\mathcal{T}^i, y^i) - \lambda \left( x^i - \frac{1}{M} \right) + \alpha \sum_{s^{i\prime}} p(s^{i\prime} \mid s^i, y^i)\, U^{i,*}(s^{i\prime}, \lambda) \right\} \qquad \text{(IV.8)}$$

subject to $y^i \in \mathcal{P}^i(\mathcal{T}^i, \beta^i)$, with the resource price obtained from the dual problem

$$\min_{\lambda \geq 0} \sum_{i=1}^{M} U^{i,*}(s^i, \lambda). \qquad \text{(IV.9)}$$

Importantly, the $i$th user's dynamic programming equation defines the optimal scheduling action as a function of the $i$th user's state, rather than the global state $s$. In this paper, the $i$th user solves (IV.8) offline using value iteration; however, it can easily be solved online using reinforcement learning as in [10] and [19]. Also, note that due to the distributed nature of the proposed algorithm, the stage resource constraint $\sum_{i=1}^{M} x^i \leq 1$ is not guaranteed to be satisfied during convergence or at steady state. Because the stage resource constraint may be violated, it must be enforced separately by the AP, which we assume normalizes the requested resource allocations and, subsequently, has the users recompute their scheduling policies to satisfy the new allocations.

*We note that the resource price is only used to efficiently allocate the limited wireless resources among the users; it is not used to generate revenue for the AP. In other words, it is a congestion price rather than a real price.
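
To make the structure of a priced local MDP concrete, the sketch below runs value iteration on a deliberately tiny single-user model. Everything specific here is an assumption for illustration: the state is a (backlog, rate) pair with the rate in packets per slot, one packet arrives per slot, each sent packet earns a fixed `gain`, and the time fraction is `y / r`. Only the form of the stage reward, utility minus `price * (x - 1/M)`, follows (IV.8); this is not the traffic model used in the paper.

```python
def solve_local_mdp(price, rates_pmf, num_users=3, buffer_max=4, arrivals=1,
                    gain=1.0, discount=0.8, tol=1e-9):
    """Value iteration for a toy single-user MDP with a resource price.

    State: (backlog b, rate r in packets/slot). Action: send y <= min(b, r).
    Stage reward: gain*y - price*(y/r - 1/num_users).
    """
    states = [(b, r) for b in range(buffer_max + 1) for r in rates_pmf]
    value = {s: 0.0 for s in states}
    while True:
        new_value, policy = {}, {}
        for (b, r) in states:
            best, best_y = float("-inf"), 0
            for y in range(min(b, r) + 1):
                x = y / r  # time fraction consumed by sending y packets
                b_next = min(buffer_max, b - y + arrivals)
                # The next rate is drawn i.i.d. from the pmf, independent of y.
                future = sum(p * value[(b_next, r2)]
                             for r2, p in rates_pmf.items())
                q = gain * y - price * (x - 1.0 / num_users) + discount * future
                if q > best:
                    best, best_y = q, y
            new_value[(b, r)], policy[(b, r)] = best, best_y
        gap = max(abs(new_value[s] - value[s]) for s in states)
        value = new_value
        if gap < tol:
            return value, policy

# Hypothetical rate pmf: 1 or 2 packets per slot with equal probability.
_, free_policy = solve_local_mdp(0.0, {1: 0.5, 2: 0.5})
_, costly_policy = solve_local_mdp(10.0, {1: 0.5, 2: 0.5})
print(free_policy[(4, 2)], costly_policy[(4, 2)])
```

In this toy model a zero price makes the user send as much as the rate allows, whereas a sufficiently high price suppresses transmission altogether, which is the lever the AP uses to regulate demand.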

Although the optimization can be decomposed across the users, the optimal resource price $\lambda$ still depends on all of the users' resource demands. Hence, $\lambda$ must be determined by the AP in both the uplink and downlink scenarios. Specifically, the resource price can be numerically computed by the AP using the subgradient method. The subgradient with respect to $\lambda$ is given by $\sum_{i=1}^{M} X^i - \frac{1}{1-\alpha}$, where $X^i = E\left[\sum_{t=0}^{\infty} \alpha^t x_t^i \mid s_0^i\right]$ is the $i$th user's expected discounted accumulated resource consumption, which can be calculated as described in [10]. Importantly, $X^i$ can be computed locally by the $i$th user in the uplink scenario and by the AP in the downlink scenario. Using the subgradient method, the resource price is updated as

$$\lambda^{k+1} = \left[ \lambda^k + \mu^k \left( \sum_{i=1}^{M} X^i - \frac{1}{1-\alpha} \right) \right]^+ \qquad \text{(IV.10)}$$

where $[\cdot]^+ = \max\{\cdot, 0\}$ and $\mu^k$ is a diminishing step size. Since the focus of this paper is on the interaction between the multiuser video transmission and the cooperative PHY layer, we refer the interested reader to [10] for complete details on the dual decomposition outlined in this subsection, and the derivation of the subgradient with respect to $\lambda$.
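
The price update can be sketched in a few lines of Python. The update rule below is a projected subgradient step with step size 1/k; the demand curve `toy_demand` is purely hypothetical and stands in for each user's expected discounted resource consumption, which in the paper comes from solving the local MDPs.

```python
def update_price(price, demands, discount, step):
    """One projected subgradient step: raise the price when demand exceeds supply."""
    subgradient = sum(demands) - 1.0 / (1.0 - discount)
    return max(0.0, price + step * subgradient)

def toy_demand(price, discount=0.8):
    # Hypothetical demand curve: discounted resource use falls linearly with price.
    return max(0.0, 1.0 - price) / (1.0 - discount)

def run_price_updates(demand_fn, num_users, discount=0.8, iterations=200, price=0.0):
    """Iterate the price update with the diminishing step size 1/k."""
    for k in range(1, iterations + 1):
        demands = [demand_fn(price) for _ in range(num_users)]
        price = update_price(price, demands, discount, step=1.0 / k)
    return price

print(run_price_updates(toy_demand, num_users=3))
```

With three identical users and this demand curve, the price settles where total discounted demand equals the discounted supply 1/(1 - discount), i.e., near 2/3.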

We note that a similar decomposition has recently been proposed for energy-efficient uplink scheduling with delay constraints in multiuser wireless networks using a different MU-MDP framework [19]. Besides the fact that [19] does not consider physical layer cooperation or heterogeneous traffic characteristics, there is one significant difference between the decomposition in [19] and the one adopted in this paper. Specifically, the TDMA-like protocol in [19] assumes that only one user can transmit in each time slot, whereas we consider a TDMA-like protocol in which each time slot is divided into transmission opportunities of different lengths for each user. Moreover, in [19], every user has a unique Lagrange multiplier associated with its average buffer delay constraint. In contrast, in our decomposition, all users have the same Lagrange multiplier, which regulates the resource division among the users, rather than their individual delay constraints. Note that, in this paper, delay constraints are included in the application model. Importantly, Theorem 1 applies to the MU-MDP formulation in [19], and therefore the recruitment protocol proposed in Section V can be used to integrate cooperation into [19]. In other words, the novelty and technical contributions of this paper are independent of the dual decomposition in [10], which we only use for illustrative purposes.

V. Recruitment protocol 

With reference to the uplink scenario, we define our opportunistic cooperative strategy to select distributively the set of cooperative relays $\mathcal{C}_t^i$ and make the decision $z_t^i$ at the AP. The downlink case is a minor variation.

Importantly, the AP can exactly evaluate the second-hop data rate in (III.6) because it can estimate the direct channel coefficient and the cooperative channel vector via training, as mentioned in Section III. However, the trouble in recruiting relays on-the-fly is that the AP and the relays cannot directly compute the first-hop data rate given by (III.4), since they cannot estimate the channel coefficients $h_t^{i\ell}$ for all $\ell \in \mathcal{C}_t^i$. Some randomized MAC protocols have recently been proposed [22], [23], which get around the problem that the AP and the relays do not have the necessary channel state information to determine this rate. However, such protocols require the exchange and/or the tracking of a large number of network parameters, which may incur unacceptable delays in a wireless video network. In particular, the first- and second-hop data rates are computed in [23] by the source node using the average PER evaluated by simulations. To quickly set up the cooperative transmission and, thus, reduce the delays, we propose a much simpler recruitment scheme that is based on the closed-form formulas (III.4) and (III.6). The proposed four-way protocol is reminiscent of the request-to-send (RTS) and clear-to-send (CTS) handshaking used in carrier sense multiple access with collision avoidance (CSMA/CA), which is extended to include a helper-ready-to-send (HTS) control message that is cooperatively transmitted by the relays using randomized STBC and a cooperative recruitment signal (CRS) that is sent by the AP to recruit relays. The idea of sending the HTS frame in cooperative mode was originally proposed in [23]. However, apart from the use of the HTS control message, the proposed protocol is different from that of [23] because we use a completely different recruitment policy.

All the control frames are transmitted at the base rate $\beta_0$ such that they can be decoded correctly, and the thresholds BEP$_1$ and BEP$_2$, as well as $L$ and $R_c$, are fixed parameters that are known at all the nodes. Fig. 3 illustrates the signaling protocol for time slot $t$, which consists of the nine steps detailed in Table I. We would like to highlight that, similar to the data transmitted in Phase II, the HTS message is a cooperative signal, i.e., all relays jointly deliver the HTS frame using randomized STBC at the same time and, hence, simultaneous transmissions do not cause a collision. With reference to Table I, the key observation is that the selection of the set $\mathcal{C}_t^i$ by virtue of (VII.4) is done in a distributed way and, moreover, by simply having access to the channel state from the source $i$ to itself, i.e., $h_t^{i\ell}$, the $\ell$th candidate cooperative node can autonomously determine if, by cooperating, it can improve the data rate of node $i$. Another important observation is that the recruitment of the cooperative nodes and the assignment of the data rates require only four control messages for each source. In particular, the control information exchange is independent of the number of recruited relays thanks to the randomization of the cooperative transmission. Moreover, the two parameters $\xi_t$ and $L$ need to be chosen appropriately. The best choice for $\xi_t$ and $L$ requires global network information. A learning framework would be very appropriate for their selection, but we defer the treatment of this aspect to future work. Finally, as for the impact of $L$ on the network performance, it should be noted that randomized channels tend to behave statistically like their non-randomized counterparts [16], with deep-fade events that become as frequent as those of $L$ independent channels, as long as the number of cooperative nodes $N_t^i \geq L + 1$.
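
The distributed self-selection step can be sketched as follows. The rule coded here is one reading of steps 4 and 5 of Table I, stated as an assumption: the Phase I rate is the direct rate divided by the self-selection parameter, and a candidate joins when the ratio of the direct rate to its own source-to-relay rate does not exceed that parameter. The function names and example rates are hypothetical.

```python
def phase_one_rate(rate_direct, xi):
    """Assumed Phase I rate: the direct rate scaled up by 1/xi, with 0 < xi < 1."""
    return rate_direct / xi

def self_select(rate_direct, rates_to_candidates, xi):
    """Return the candidates whose source-to-relay rate supports the Phase I rate.

    rates_to_candidates maps a node index to the rate that node measured on the
    link from the source to itself; each node only needs its own entry.
    """
    return sorted(node for node, rate in rates_to_candidates.items()
                  if rate > 0 and rate_direct / rate <= xi)

# Hypothetical normalized rates measured by four candidate relays.
candidates = {1: 2.0, 2: 10.0, 3: 4.9, 4: 5.0}
print(phase_one_rate(1.0, 0.2), self_select(1.0, candidates, 0.2))
```

Each node evaluates the test with only the fed-back direct rate, the parameter, and its own link rate, which is why the control overhead does not grow with the number of recruited relays.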

VI. Numerical Results 

We consider a network with 50 potential relay nodes placed randomly and uniformly throughout the 100 m coverage range of a single AP, as illustrated in Fig. 4. We specify the placement of the video source(s) separately for each experiment. Let $r^{i\ell}$ denote the distance in meters between the $i$th and $\ell$th nodes. The fading coefficient $h_t^{i\ell}$ over the $i \to \ell$ link is modeled as an i.i.d. $\mathcal{CN}(0, (r^{i\ell})^{-\delta})$ random variable, where $\delta$ is the path-loss exponent. Additionally, we assume that the entries of $\mathcal{R}$, defined in Section III, are i.i.d. $\mathcal{CN}(0, \frac{1}{L})$ random variables, where $L$ is the length of the STBC. If an error occurs in the packet transmission, then the packet remains in the frame buffer to be retransmitted in a future time slot (assuming the packet's deadline has not passed).
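
For readers who want to reproduce the channel model, the following sketch draws circularly symmetric complex Gaussian fading coefficients whose variance decays with distance raised to the path-loss exponent. The helper names, the seed, and the Monte Carlo check are illustrative assumptions; only the distribution and the exponent value of 3 come from the text and Table II.

```python
import math
import random

def sample_fading(distance_m, path_loss_exp=3.0, rng=random):
    """Draw h ~ CN(0, distance^-path_loss_exp).

    The real and imaginary parts are independent Gaussians, each carrying
    half of the total variance.
    """
    variance = distance_m ** (-path_loss_exp)
    sigma = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

def mean_power(distance_m, trials=20000, seed=0):
    """Monte Carlo estimate of E[|h|^2], which should approach the variance."""
    rng = random.Random(seed)
    total = sum(abs(sample_fading(distance_m, rng=rng)) ** 2 for _ in range(trials))
    return total / trials

print(mean_power(10.0))  # should be close to 10 ** -3
```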

Due to space constraints, and because cooperation has the same impact in both uplink and downlink scenarios, we 
only present results for cooperative uplink video transmission. In particular, we consider four uplink scenarios: 

1) Single source: In this scenario, we assume that a single source node is placed between 10 and 100 m directly to 
the right of the AP in Fig. 4. We use this scenario to evaluate the transmission rates in the direct and cooperative 
transmission modes at different distances from the AP, and to determine a good self-selection parameter $\xi$.

2) Homogeneous video sources: This scenario mimics a surveillance application in which three cameras capture correlated video content in an outdoor environment and transmit it to the AP. The video sources are placed to the right of the AP as illustrated in Fig. 7. To simulate correlated content, we assume that each of the three cameras streams the Foreman sequence (CIF resolution, 30 Hz frame rate, encoded at 1.5 Mb/s) offset by several frames. Using homogeneous sources allows us to isolate the impact of cooperation on the video streaming performance by removing the additional layer of complexity introduced by heterogeneous video sources (e.g., different packet priorities and bit rates among the video users).

3) Heterogeneous video sources 1: This scenario mimics a network in which users deploy entertainment applications such as video sharing or video conferencing. To simulate this, we assume that the three video sources illustrated in Fig. 7 transmit heterogeneous video content to the AP. Specifically, we assume that video user 1 streams the Coastguard sequence (CIF, 30 Hz, 1.5 Mb/s), video user 2 streams the Mobile sequence (CIF, 30 Hz, 2.0 Mb/s), and video user 3 streams the Foreman sequence (CIF, 30 Hz, 1.5 Mb/s).

4) Heterogeneous video sources 2: This is the same as the previous scenario, but with video user 2 streaming the 
Foreman sequence and video user 3 streaming the Mobile sequence. 

We note that the proposed framework can be applied using any video coder to compress the video data. However, 
for illustration, we use a scalable video coding scheme [25], which is attractive for wireless streaming applications 
because it provides on-the-fly application adaptation to channel conditions, support for a variety of wireless receivers 
with different resource and power constraints, and easy prioritization of video packets. 

In our results, we deploy the proposed randomized STBC cooperation protocol outlined in Table I and determine 
the optimal resource allocation and scheduling decisions using the distributed optimization introduced in Section IV-B. 
The relevant simulation parameters are given in Table II. Note that, in the homogeneous and heterogeneous scenarios described above, we simulate a network with a "high" transmission rate, using the symbol rate $1/T = 1250000$ symbols/second, and a network with a "low" transmission rate, using the symbol rate $1/T = 625000$ symbols/second.

A. Transmission rates and energy consumption 

In this subsection, we consider the single source scenario described above. Fig. 5 illustrates the performance of the proposed cooperation protocol for time-invariant self-selection parameter values $\xi_t = \xi \in \{0.1, 0.2, \ldots, 0.5\}$, and the performance of direct transmission, given a single source transmitting to the AP. Note that these results hold regardless of the symbol rate. In particular, the "transmission rate" in Fig. 5(a) is presented in terms of the spectral efficiency (bits/second/Hz); the probability of cooperation in Fig. 5(b) and the average number of recruited relays in Fig. 5(c) only depend on the spectral efficiency; and the energy results reported in Figs. 5(d-f) are normalized by setting the symbol energy $\mathcal{E}_s = T/P$ (or, equivalently, the symbol power $\mathcal{P}_s = 1/P$) in (III.3), (III.11), and (III.12).

From Fig. 5(b), it is clear that nodes further from the AP utilize cooperation more frequently than nodes closer to the AP. This is because, on average, distant nodes have the feeblest direct signals to the AP due to path loss and, therefore, have the most to gain from the channel diversity afforded to them by cooperation. It is also clear from Fig. 5(b) that cooperation is utilized more frequently as the self-selection parameter $\xi$ increases. This is because, as illustrated in Fig. 5(c), more relays satisfy the self-selection condition in step 5 of Table I for larger values of $\xi$.

However, larger values of $\xi$ yield relay nodes for which $\beta_t^{i0}/\beta_t^{i\ell}$ is large, which leads to a bad transmission rate over the bottleneck hop-1 cooperative link. Due to this poor bottleneck rate and the large number of recruited relays, the average transmission rate shown in Fig. 5(a) declines for $\xi > 0.2$ even while the total energy consumption increases, as illustrated in Fig. 5(d). In contrast, lower values of the self-selection parameter (e.g., $\xi < 0.2$) lead to too few nodes being recruited to achieve large cooperative gains, but yield lower energy consumption. Interestingly, the same properties of relay nodes that are desirable for achieving the best transmission rate, namely a balance between the number and quality of relays, are also important for achieving a high throughput-to-energy ratio. For example, Fig. 5(e) shows that at 100 m from the AP, the average throughput-to-energy ratio for cooperative transmission with $\xi = 0.2$ is a little less than 0.8, which is close to the throughput-to-energy ratio of a direct transmission, which is 1 at 100 m.

Although the average network energy required to support a cooperative transmission is larger than that required for a direct transmission, this increase is moderate compared to the amount of energy the source node would have to expend in order to achieve the same transmission rate as the cooperative transmission, i.e., to attain $\beta_t^{i0} = \beta_t^{i,\mathrm{coop}}$ requires a large increase in the transmission power with respect to the cooperative case. This is illustrated in Fig. 5(f), where, for example, it is shown that transmitting in the direct mode at the rate attainable under cooperative transmission with $\xi = 0.2$ requires approximately 13.5 normalized Joules/Packet compared to approximately 3.5 normalized Joules/Packet in the cooperative case shown in Fig. 5(d).*
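
To put the two figures side by side, the following worked ratio uses only the approximate values quoted above; the conversion to decibels is added here for convenience and is not a result reported in the paper.

```latex
\frac{13.5}{3.5} \approx 3.9,
\qquad
10 \log_{10}\!\left(\frac{13.5}{3.5}\right) \approx 5.9~\text{dB}
```

In other words, matching the cooperative rate in direct mode costs the source roughly four times the energy that the whole network spends per packet under cooperation.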

In the remainder of our experiments, we let the self-selection parameter $\xi_t = \xi = 0.2$ because, as illustrated in Figs. 5(a,e), this value provides a large average transmission rate over the AP's entire coverage range and a high throughput-to-energy ratio. With $\xi = 0.2$, Fig. 7 illustrates the activation frequencies for different relays and Fig. 6 illustrates the average energy consumed by the source and relay nodes. Notice that, under a cooperative transmission, the source node actually uses less power than under a direct transmission, which partially compensates for the extra energy it may expend acting as a relay for other nodes.

*The results in Fig. 5(f) were obtained by fixing the transmission rate and adapting the symbol energy, which is in contrast to the current problem formulation in which we fix the symbol energy and adapt the transmission rate. Specifically, we calculated the symbol energy $\mathcal{E}_s$ required to set $\beta_t^{i0} = \beta_t^{i,\mathrm{coop}}$ by rearranging (III.2). Note that we could also force $\beta_t^{i,\mathrm{coop}} = \beta_t^{i0}$ to achieve lower energy consumption at the same transmission rate as the direct mode.

B. Transmission rate, resource price, and resource utilization 

Fig. 8 illustrates the average transmission rates achieved by the video users in the homogeneous and heterogeneous 
scenarios in networks that support high and low transmission rates. Recall that the resource cost $x^i$ incurred by user
$i$ is inversely proportional to the transmission rate [see (IV.5)], which decreases as the distance to the AP increases
due to path loss. Hence, when only direct transmission is available, user 3 tends to resign itself to a low average 
transmission rate because the cost of using resources is too high. Cooperation increases the average transmission 
rate, thereby providing user 3 lower cost access to the channel to transmit more data. 

In the homogeneous scenario illustrated in Fig. 8(a), cooperation tends to equalize the resource allocations to the 
three users (this is especially evident in the cooperative case with a high transmission rate). This is because the 
homogeneous users have identical utility functions; thus, when sufficient resources are available, it is optimal for 
them to all operate at the same point of their resource-utility curves. In contrast, when heterogeneous users with 
different utility functions are introduced, the transmission rates change to reflect the priorities of the different users' 
video data. Observing Fig. 8(b,c), it is clear that the additional resources afforded by cooperation tend to go to the 
highest priority video user, who, in our simulations, is the user streaming the Mobile sequence. 

Recall that users autonomously optimize their resource allocation and scheduling actions given the resource price $\lambda$ announced by the AP. Table III illustrates the optimal resource prices in the homogeneous and heterogeneous scenarios along with the average network resource utilization, i.e., the average of $\sum_{i=1}^{M} x^i$. There are several interesting results in Table III. First, the average network resource utilization is often considerably less than the total available resources. This is due to the distributed nature of the resource allocation and scheduling algorithm, which requires users to be conservative in their resource usage to ensure feasible allocations. Second, in the cooperative transmission mode, the resource price tends to increase and the utilization tends to decrease when going from a high rate to a low rate network, regardless of the streaming scenario. The resource price increases because the network supports lower rates, but the demand stays the same, which increases congestion. The utilization decreases because lower rates yield a coarser set of feasible resource allocations for each user [see (IV.5)]. Third, in the high rate network, the resource price tends to decrease and the utilization tends to increase when going from the direct to the cooperative transmission mode, regardless of the streaming scenario. The resource price decreases because cooperation floods the network with resources without significantly impacting demand, which reduces congestion. The utilization increases because the cooperative transmission mode supports higher transmission rates, which yield a finer set of feasible resource allocations for each user [see (IV.5)]. Finally, in the low rate network, the resource price and utilization tend to increase when going from the direct to the cooperative transmission mode. In contrast to the high rate network, the resource price increases because users that resigned themselves to very low transmission rates in the direct scenario suddenly demand resources when cooperation is enabled. The resource price increases in our simulations because the enlarged demand pool exceeds the additional supply of resources that is introduced by cooperation. In other words, users that would like to transmit video, but are too far from the AP for a direct transmission, are essentially absent from the network when only direct transmission is available, and therefore do not significantly impact the resource price and resource utilization; however, when cooperation is enabled, these users are suddenly within range of the AP, and will therefore demand resources, which increases congestion. As in the other cases, the utilization increases because the transmission rate increases.

C. Discounted utility and video quality comparison 

Table IV compares the expected value of the objective function in (IV.9) (with respect to the stationary distribution 
over the states) obtained in the homogeneous and heterogeneous scenarios. Because the objective function includes a 
Lagrangian cost term, it is not always indicative of the corresponding video quality. For this reason, we also include 
Table V to compare the video quality obtained in the homogeneous and heterogeneous scenarios, where video quality 
is measured in terms of peak signal-to-noise ratio (PSNR in dB) of the luminance channel. In the network that supports a high transmission rate, the user furthest from the AP (user 3) benefits on the order of 5-10 dB PSNR from cooperation, while the video user closest to the AP (user 1) is penalized by less than 0.4 dB PSNR. In the network that only supports low transmission rates, user 3 goes from transmitting too little data to decode the video (denoted by " ") to transmitting enough data to decode at low quality, while penalizing user 1 by less than 0.8 dB PSNR. Note that these PSNR results implicitly reflect the end-to-end delay from the source, through the
relays, to the destination. This is because the sophisticated traffic model in subsection II-B accounts for the fact that 
frames that are not entirely received before their deadlines, and frames that depend on them, cannot be decoded and 
therefore do not contribute to the received video quality. 
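
For reference, the PSNR figures quoted above follow the standard definition for 8-bit samples. The sketch below computes it for two equally sized sequences of luminance values; the function name and the flat-list input format are assumptions made for illustration.

```python
import math

def psnr_db(reference, decoded, peak=255.0):
    """PSNR in dB between two equally sized sequences of luminance samples."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)

# A uniform error of one gray level on 8-bit samples.
print(round(psnr_db([100] * 4, [101] * 4), 2))
```

Since PSNR is logarithmic in the mean squared error, a gain of 5 to 10 dB corresponds to reducing that error by a factor of roughly 3 to 10.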

VII. Conclusion

We introduced a cooperative multiple access strategy that enables nodes with high priority video data to be 
serviced while simultaneously exploiting the diversity of channel fading states in the network using a randomized 
STBC cooperation protocol. We formulated the dynamic multi-user video transmission problem with cooperation as 
an MU-MDP and we used Lagrangian relaxation with a uniform resource price to decompose the MU-MDP into 
local MDPs at each user. We analytically proved that opportunistic (myopic) cooperation strategies are optimal, and 
therefore the users' local MDPs only need to determine their optimal resource allocation and scheduling policies
based on their experienced cooperative transmission rates. Subsequently, we proposed a randomized STBC cooperation 
protocol that enables nodes to opportunistically and distributively self-select themselves as cooperative relays. Finally, 
we experimentally showed that the proposed cooperation strategy significantly improves the video quality of nodes 
with feeble direct links to the AP, without significantly penalizing other users, and with only moderate increases in 
total network energy consumption. 

Appendix I: proof of Theorem 1 

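The transmission rate $\beta^i$ is a function of the cooperation decision and the channel state $\mathbf{H}$, i.e., we can write $\beta^i = \beta^i(\mathbf{H}, z^i)$. Thus, the cooperation decision impacts the immediate utility because it constrains the set of feasible scheduling actions $\mathcal{P}^i(\mathcal{T}^i, \beta^i)$ through the packet constraint, which bounds $\|y^i\|_1$ in terms of $\beta^i$.

Let $z^i_{\mathrm{opp}} = \arg\max_{z^i} \{\beta^i(\mathbf{H}, z^i)\}$ and $\beta^i_{\mathrm{opp}} = \max_{z^i} \{\beta^i(\mathbf{H}, z^i)\}$ denote the optimal opportunistic cooperation decision and the maximum transmission rate, respectively. Selecting the cooperation decision that maximizes the immediate transmission rate enlarges the set of feasible scheduling actions, i.e., $\mathcal{P}^i(\mathcal{T}^i, \beta^i) \subseteq \mathcal{P}^i(\mathcal{T}^i, \beta^i_{\mathrm{opp}})$, for all $\beta^i \leq \beta^i_{\mathrm{opp}}$. We now show that the optimal opportunistic cooperation decision enables a user to maximize its long-term utility for any $\alpha \geq 0$. Let $u^i_\lambda(\mathcal{T}^i, \beta^i, y^i) = \sum_{j \in \mathcal{F}^i} q^i_j y^i_j - \lambda \left( x^i - \frac{1}{M} \right)$ denote the utility less the cost, where $x^i$ is given by (IV.5). Under the optimal opportunistic cooperation decision, we have

$$U^{i,*}(s^i) = \max_{y^i \in \mathcal{P}^i(\mathcal{T}^i, \beta^i_{\mathrm{opp}})} \left\{ u^i_\lambda(\mathcal{T}^i, \beta^i_{\mathrm{opp}}, y^i) + \alpha \sum_{s^{i\prime}} p(s^{i\prime} \mid s^i, y^i)\, U^{i,*}(s^{i\prime}) \right\} \qquad \text{(VII.1)}$$

$$\geq \max_{y^i \in \mathcal{P}^i(\mathcal{T}^i, \beta^i)} \left\{ u^i_\lambda(\mathcal{T}^i, \beta^i, y^i) + \alpha \sum_{s^{i\prime}} p(s^{i\prime} \mid s^i, y^i)\, U^{i,*}(s^{i\prime}) \right\} = U^i(s^i), \qquad \text{(VII.2)}$$

where the inequality is due to the fact that $\mathcal{P}^i(\mathcal{T}^i, \beta^i) \subseteq \mathcal{P}^i(\mathcal{T}^i, \beta^i_{\mathrm{opp}})$ for all $\beta^i \leq \beta^i_{\mathrm{opp}}$. Thus, the optimal opportunistic cooperation decision maximizes the long-term utility.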

Fig. 1. An uplink wireless video network with cooperation. A downlink wireless video network with cooperation can be visualized by switching the positions of node 1 and the access point.

Fig. 2. (a) Illustrative DAG dependencies and scheduling time window using IBPB GOP structure. The schedulable frame sets defined by the scheduling time window W are $\mathcal{F}_t = \{1, 2, 3\}$, $\mathcal{F}_{t+1} = \{2, 3, 4, 1\}$, $\mathcal{F}_{t+2} = \{4, 1, 2, 3\}$, $\mathcal{F}_{t+3} = \{2, 3, 4, 1\}$, etc. Clearly, $\mathcal{F}_t$ is periodic with period T = 3 excluding the initial time t, and each GOP contains 4 frames. (b) Traffic state detail for schedulable frame set $\mathcal{F}_t = \{1, 2, 3\}$. $b_j$ denotes the state of the $j$th frame's buffer, where $j \in \mathcal{F}_t = \{1, 2, 3\}$.

Fig. 3. Signaling protocol for randomized STBC cooperation.
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TABLE I 

The proposed protocol for randomized STBC cooperation. 



Cl = \l:^,<^t\ , (VII.4) 



Step 1) The zth source initiates the handshaking by transmitting the RTS frame, which announces its desire to transmit data 

symbols and also includes training symbols that are used by the other nodes to estimate the link gains. 
Step 2) From the RTS message, the AP estimates the channel coefficients hf" and, hence, determines /Sj". At the same time, by 

passively listening to all the RTS messages occurring in the network, the other nodes estimate their respective channel 

parameters hf, for ££{1,2,..., A/} — {i}, and, thus, determine 
Step 3) The AP responds with the CRS message that provides feedback on /3f" to all the candidate cooperative nodes and the 

source, as well as a second parameter < .^t < 1, which is used to recruit relays. 
Step 4) From the CRS message, the ith source learns that a cooperative transmission may take place and, if such a communication 

mode will be subsequently confirmed by the AP, the data rate to be used in Phase I is given by 

niO 

= ^ . (VII.3) 
Step 5) After receiving the CRS frame, the candidate cooperative nodes can self-select themselves according to the rule: 

where is defined using (III.2) by replacing BEP with BE Pi. The nodes belonging to the formed group CI send 
in unison the HTS message using randomized STBC of size L as described in Section III, which piggybacks training 
symbols that are used by the AP to estimate the cooperative channel vector RhJ'^. 
Step 6) After estimating the channel of the cooperative link, the AP computes the Phase II data rate by resorting to (III.6) and verifies whether condition (III.7) is fulfilled. If (III.7) holds, then, accounting also for (VII.3), it can be inferred that cooperation is better than direct transmission, i.e., condition (III.10) is satisfied: in this case, $z_t^i = 1$. Otherwise, cooperation is useless: in this case, $z_t^i = 0$. Therefore, the AP responds with a CTS frame, which conveys the following information: (i) the cooperation decision $z_t^i$; (ii) if $z_t^i = 1$, the data rate in Phase II given by (III.6); (iii) the resource price $\lambda$ computed as explained in Section IV.
Step 7) If $z_t^i = 1$ in the CTS frame, the source proceeds with sending its data frame in Phase I at the rate given by (VII.3); otherwise, if $z_t^i = 0$, it transmits in direct mode at the direct-link data rate $\beta_t^{i0}$.
Step 8) If $z_t^i = 1$ in the CTS frame, the self-recruited relays, along with the source, cooperatively transmit the data frame in Phase II at the rate given by (III.6); otherwise, if $z_t^i = 0$, they remain silent.
Step 9) The AP finishes the procedure by sending an acknowledgement (ACK) message back to the source.
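
The AP-side decision in Steps 4 to 8 can be sketched as follows. This is only an illustration under assumptions that the table itself does not spell out: the Phase I rate is taken to be the direct-link rate divided by the self-selection parameter, and a two-phase transmission that sends the packet once per phase is assumed to have an effective rate equal to the product of the two phase rates divided by their sum. The paper's actual tests are conditions (III.7) and (III.10), which are defined elsewhere. The function and variable names are hypothetical.

```python
def cooperation_decision(beta_direct, xi, beta_phase2):
    """Return (z, effective_rate) for one transmission opportunity.

    beta_direct : data rate of the direct source-to-AP link (bits/s)
    xi          : self-selection parameter, 0 < xi < 1
    beta_phase2 : Phase II rate of the cooperative link (bits/s)
    """
    if not 0.0 < xi < 1.0:
        raise ValueError("xi must lie strictly between 0 and 1")
    # Assumption: Phase I rate is the direct rate scaled up by 1/xi.
    beta_phase1 = beta_direct / xi
    # Assumption: the packet is sent once in each phase, so the time per bit
    # is 1/beta_phase1 + 1/beta_phase2.
    beta_coop = beta_phase1 * beta_phase2 / (beta_phase1 + beta_phase2)
    # z = 1 means "cooperate", z = 0 means "transmit directly".
    z = 1 if beta_coop > beta_direct else 0
    return z, (beta_coop if z else beta_direct)
```

Under these assumptions, cooperation wins exactly when the Phase II rate exceeds `beta_direct / (1 - xi)`, so a smaller `xi` lowers the bar that the relays' joint transmission has to clear.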
TABLE II
Simulation parameters.

| Parameter | Description | Value |
|---|---|---|
| $L$ | Length of the STBC | 2 |
| | Rate of orthogonal STBC rule | 1 |
| $\xi$ | Self-selection parameter | 0.1, 0.2, 0.3, 0.4, 0.5 |
| $P$ | Packet size | 8000 bits |
| BEP | Bit error probability target (uncoded) | 10-^ |
| $\delta$ | Path loss exponent | 3 |
| | WLAN coverage radius (5 dB SNR at boundary) | 100 m |
| $M$ | Number of nodes (excluding the AP) | 50 |
| $\alpha$ | Discount factor | 0.80 |
| | Symbol rate (symbols per second) | 625000 or 1250000 |
| $\mathcal{E}_s$ | Symbol energy (normalized) | $T/P$ Joules |
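
To make the path loss entries concrete, the sketch below computes the average SNR at a given distance from the AP. It assumes a pure power-law path loss with the tabulated exponent of 3, anchored at 5 dB on the 100 m coverage boundary, and it ignores fading. The function name and its defaults are illustrative and are not taken from the paper.

```python
import math

def mean_snr_db(distance_m, edge_snr_db=5.0, radius_m=100.0, exponent=3.0):
    """Average SNR (dB) at distance_m under a power-law path loss,
    anchored so that the SNR equals edge_snr_db at radius_m."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    # Each factor of 10 in distance costs 10 * exponent dB.
    return edge_snr_db + 10.0 * exponent * math.log10(radius_m / distance_m)

# Distances of the three video sources used in the streaming scenarios.
for d in (20, 45, 80, 100):
    print(f"{d:>3} m: {mean_snr_db(d):5.2f} dB")
```

Under this model the sources at 20 m, 45 m, and 80 m see roughly 26 dB, 15 dB, and 8 dB on average, which illustrates why the most distant user has the most to gain from relays.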






Fig. 4. Network topology used for numerical results. There are 50 nodes placed randomly and uniformly throughout the AP's 100 m coverage 
range. 
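
A topology of this kind can be generated with the sketch below. It assumes that "uniformly" means uniform over the area of the coverage disk with the AP at the origin; the paper does not say how its topology was drawn, and the function name and seed are hypothetical.

```python
import math
import random

def place_nodes(num_nodes=50, radius_m=100.0, seed=0):
    """Return num_nodes (x, y) positions drawn uniformly over a disk."""
    rng = random.Random(seed)
    nodes = []
    for _ in range(num_nodes):
        # The square root makes the density uniform over area, not over radius.
        r = radius_m * math.sqrt(rng.random())
        theta = 2.0 * math.pi * rng.random()
        nodes.append((r * math.cos(theta), r * math.sin(theta)))
    return nodes
```

Sampling the radius without the square root would cluster nodes near the AP, which would overstate the number of relays available to nearby sources.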
Fig. 5. Cooperative transmission statistics for different values of the self-selection parameter $\xi$ and for different distances from the AP. (a) Average transmission rate. (b) Probability of cooperation being optimal. (c) Average number of recruited relays. (d) Average energy consumed in the network per packet transmission. (e) Throughput per unit energy. (f) Average energy required by the source to transmit one packet at
the rate $\beta_t^{i0} = \beta_t^{i,\mathrm{coop}}$.
Fig. 6. Average energy consumed by the source (Src) during direct and cooperative transmission, and average energy consumed by a relay (Rly) during cooperative transmission. A self-selection parameter $\xi = 0.2$ is used for cooperative transmission.
Fig. 7. Video source placement for homogeneous and heterogeneous streaming scenarios. Three video sources are placed 20 m, 45 m, and 80 m from the AP at angles 25°, -30°, and 0°, respectively. (a, b, c) Relay activation frequencies for video sources 1, 2, and 3, respectively, with self-selection parameter $\xi = 0.2$. The size of each relay marker is proportional to the frequency with which that relay is activated as a helper for the corresponding source.






[Fig. 8 plot panels; axis tick values omitted. (a) Homogeneous: Foreman 1 (20 m), Foreman 2 (45 m), Foreman 3 (80 m). (b) Heterogeneous 1: Coastguard (20 m), Mobile (45 m), Foreman (80 m). (c) Heterogeneous 2: Coastguard (20 m), Foreman (45 m), Mobile (80 m). Legend: Cooperative (High Rate), Direct (High Rate), Cooperative (Low Rate), Direct (Low Rate).]

Fig. 8. Average transmission rates in different scenarios. (a) Homogeneous video sources. (b, c) Heterogeneous video sources.
TABLE III

Resource prices and resource utilization in different scenarios.

| Streaming Scenario | Transmission Mode | Resource Price (High / Low) | Utilization (High / Low) |
|---|---|---|---|
| Homogeneous | Direct | 45.79 / 42.97 | 0.73 / 0.67 |
| | Cooperative | 38.72 / 52.56 | 0.88 / 0.75 |
| | Change | -6.93 / 9.59 | 0.15 / 0.08 |
| Heterogeneous 1 | Direct | 51.01 / 53.17 | 0.66 / 0.68 |
| | Cooperative | 48.02 / 71.94 | 0.89 / 0.77 |
| | Change | -2.99 / 18.77 | 0.23 / 0.09 |
| Heterogeneous 2 | Direct | 68.24 / 41.48 | 0.65 / 0.56 |
| | Cooperative | 62.61 / 72.86 | 0.89 / 0.67 |
| | Change | -5.63 / 31.38 | 0.24 / 0.11 |
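
The Change rows are the Cooperative entries minus the Direct entries. The sketch below recomputes the price changes from the values transcribed in Table III; the dictionary layout and names are illustrative only. Recomputing reproduces every listed price change except the homogeneous high-rate one, where the transcribed Direct and Cooperative prices differ by -7.07 rather than the listed -6.93, so one of those three numbers may have been damaged in extraction.

```python
# (direct, cooperative) resource prices transcribed from Table III,
# for networks that support high and low transmission rates.
prices = {
    "Homogeneous":     {"high": (45.79, 38.72), "low": (42.97, 52.56)},
    "Heterogeneous 1": {"high": (51.01, 48.02), "low": (53.17, 71.94)},
    "Heterogeneous 2": {"high": (68.24, 62.61), "low": (41.48, 72.86)},
}

def change(direct, cooperative):
    """Price change caused by enabling cooperation, rounded to 2 decimals."""
    return round(cooperative - direct, 2)

changes = {
    scenario: {rate: change(*pair) for rate, pair in by_rate.items()}
    for scenario, by_rate in prices.items()
}

for scenario, row in changes.items():
    print(f"{scenario:16s} high: {row['high']:+.2f}  low: {row['low']:+.2f}")
```

In all three scenarios the sign pattern is the same: cooperation lowers the price in the high-rate network and raises it in the low-rate network.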



TABLE IV

Expected discounted average utility in different scenarios.

| Streaming Scenario | Transmission Mode | Video User 1 @ 20 m (High / Low) | Video User 2 @ 45 m (High / Low) | Video User 3 @ 80 m (High / Low) |
|---|---|---|---|---|
| Homogeneous | (video) | Foreman | Foreman | Foreman |
| | Direct | 199.1 / 149.6 | 138.0 / 72.2 | 35.63 / 5.5 |
| | Cooperative | 195.7 / 138.3 | 179.0 / 90.2 | 143.5 / 30.7 |
| Heterogeneous 1 | (video) | Coastguard | Mobile | Foreman |
| | Direct | 85.6 / 57.2 | 306.6 / 176.8 | 17.6 / 0.3 |
| | Cooperative | 94.9 / 40.6 | 386.1 / 187.4 | 124.7 / 8.7 |
| Heterogeneous 2 | (video) | Coastguard | Foreman | Mobile |
| | Direct | 76.0 / 67.7 | 95.8 / 76.4 | 69.5 / 29.5 |
| | Cooperative | 81.7 / 40.2 | 138.4 / 54.6 | 257.0 / 81.0 |
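
As a reminder of how a discount factor weights utilities over time, the sketch below accumulates a discounted sum using the value 0.80 from Table II. The exact normalization behind the "expected discounted average utility" reported in Table IV is defined elsewhere in the paper, so this function is only an illustration and its name is hypothetical.

```python
def discounted_sum(utilities, alpha=0.80):
    """Sum of alpha**t * utilities[t] over time slots t = 0, 1, 2, ..."""
    total = 0.0
    weight = 1.0
    for u in utilities:
        total += weight * u
        weight *= alpha  # each later slot counts alpha times less
    return total
```

With a discount factor of 0.80, a constant per-slot utility of 1 accumulates to at most 1 / (1 - 0.80) = 5, so the totals are dominated by roughly the first handful of time slots.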
TABLE V

Average video quality (PSNR) in different scenarios.

| Streaming Scenario | Transmission Mode | Video User 1 @ 20 m (High / Low) | Video User 2 @ 45 m (High / Low) | Video User 3 @ 80 m (High / Low) |
|---|---|---|---|---|
| Homogeneous | (video) | Foreman | Foreman | Foreman |
| | Direct | 36.82 dB / 36.51 dB | 35.85 dB / 30.20 dB | 29.89 dB / n/a |
| | Cooperative | 36.69 dB / 35.82 dB | 36.58 dB / 34.83 dB | 36.04 dB / 27.12 dB |
| | Change | -0.13 dB / -0.69 dB | 0.73 dB / 4.63 dB | 6.15 dB / n/a |
| Heterogeneous 1 | (video) | Coastguard | Mobile | Foreman |
| | Direct | 32.30 dB / 31.09 dB | 26.74 dB / 24.53 dB | 25.94 dB / n/a |
| | Cooperative | 31.94 dB / 30.89 dB | 27.14 dB / 25.8 dB | 35.69 dB / 27.12 dB |
| | Change | -0.36 dB / -0.20 dB | 0.4 dB / 1.27 dB | 9.75 dB / n/a |
| Heterogeneous 2 | (video) | Coastguard | Foreman | Mobile |
| | Direct | 31.91 dB / 31.72 dB | 35.16 dB / 32.75 dB | 21.85 dB / n/a |
| | Cooperative | 31.56 dB / 30.97 dB | 35.72 dB / 32.39 dB | 26.53 dB / 22.03 dB |
| | Change | -0.35 dB / -0.75 dB | 0.56 dB / -0.36 dB | 4.68 dB / n/a |
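
To give a feel for what the PSNR changes mean, the sketch below converts a PSNR gain into the corresponding reduction in mean squared error, using the standard definition PSNR = 10 log10(MAX^2 / MSE). It treats each tabulated average PSNR as if it came from a single MSE value, which is only an approximation because the table reports averages. The function name is hypothetical.

```python
def mse_ratio_from_psnr_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)

# High-rate gains of the user at 80 m in the three streaming scenarios.
for gain in (6.15, 9.75, 4.68):
    print(f"{gain:5.2f} dB gain -> MSE reduced about {mse_ratio_from_psnr_gain(gain):.1f}x")
```

Under this approximation, the gains of the most distant user correspond to distortion reductions of roughly 4.1, 9.4, and 2.9 times, whereas a 0.75 dB loss for a nearby user corresponds to less than a 1.2 times increase.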